Abstract

We associate with a Bienaymé-Galton-Watson branching process a family tree rooted at the ancestor. For a positive integer $N$, define a complete $N$-ary tree to be the family tree of a deterministic branching process with offspring generating function $s^N$. We study the random variables $V_{N,n}$ and $V_N$ counting the number of disjoint complete $N$-ary subtrees, rooted at the ancestor, and having height $n$ and $\infty$, respectively. Dekking (1991) and Pakes and Dekking (1991) find recursive relations for $P(V_{N,n} > 0)$ and $P(V_N > 0)$ involving the offspring probability generating function (pgf) and its derivatives. We extend their results by determining the probability distributions of $V_{N,n}$ and $V_N$. It turns out that they can be expressed in terms of the offspring pgf, its derivatives, and the above probabilities. We show how the general results simplify in the case of fractional linear, geometric, Poisson, and one-or-many offspring laws.
1 Introduction and main results

Consider the family tree associated with a Bienaymé-Galton-Watson process with the following simple reproduction rules. At generation zero, the process starts with a single ancestor called the root of the tree. Then each individual in the population has, independently of the others, a random number $X$ of children distributed according to the offspring distribution with probability generating function (pgf)

$$f(s) = \sum_{k=0}^{\infty} p_k s^k,$$

satisfying $f(1) = 1$. Further on we adopt the well-known construction of a family tree generated by a simple branching process, where the individuals are the nodes and the parent-child relations define the arcs of the tree in the following manner; see e.g. Harris (1963), Ch. 7. Let the $i$th child of the ancestor be $(i)$ and, in general, let $(i_1 i_2 \cdots i_{k-1} i_k)$ denote the $i_k$th child of $(i_1 i_2 \cdots i_{k-1})$. Then a directed arc is assumed to emanate from $(i_1 i_2 \cdots i_{k-1})$ to $(i_1 i_2 \cdots i_{k-1} i_k)$. Since, in our case, the children appear simultaneously, we suppose that the ordering is performed by a chance device independently of the evolution of the process. This scheme produces family trees (also called rooted ordered trees) in which the nodes of height (also known as depth) $n$ ($n > 0$) have labels $(i_1 i_2 \cdots i_n)$, with the ancestor (root) having height 0. The height of a subtree equals the maximum height of its nodes.

For fixed integer $N \ge 1$, define a complete infinite $N$-ary tree to be the family tree of a deterministic branching process with offspring pgf $f(s) = s^N$. Further on we will consider rooted subtrees of a family tree. Two such subtrees are called disjoint if they do not have a common node different from the root. These kinds of trees appear, for example, in some computer algorithms; for more details see Knuth (1997).

Let $\{Z_n : n \ge 1;\ Z_0 = 1\}$ denote the generation size process, and let $T_N - 1$ be the height of a complete $N$-ary subtree rooted in the ancestor; $T_N = 0$ if $Z_1 < N$. Notice that $T_1$ is the extinction time of $\{Z_n\}$. The study of the probability $\tau_N = \lim_{n \to \infty} P(T_N > n)$ that a Bienaymé-Galton-Watson tree contains an infinite complete $N$-ary subtree was initiated by Dekking (1991), who considered complete binary ($N = 2$) subtrees. The general ($N \ge 2$) case was subsequently investigated in detail by Pakes and Dekking (1991). In particular, they encountered the following phenomenon: if $N \ge 2$, then there is a critical value $m_N^c$ for the offspring mean $m = f'(1)$ such that $\tau_N = 0$ if $m < m_N^c$ and $\tau_N > 0$ if $m \ge m_N^c$. This is qualitatively different from what happens for $N = 1$, where the probability of non-extinction is $\tau_1 = 0$ if $m = m_1^c = 1$, except for the trivial case where $f(s) = s$. Our work is motivated by the results of Pakes and Dekking (1991).

We introduce the random variable $V_N$ to be the number of disjoint complete $N$-ary subtrees with infinite height, rooted at the ancestor of a Bienaymé-Galton-Watson family tree. Clearly $\tau_N = P(V_N > 0)$. As usual, we assume for the offspring distribution $\{p_k\}_{k=0}^{\infty}$ that $p_k < 1$ for all $k$ and $p_k > 0$ for some $k > N$. Let $\mathbb{N}$ be the set of all positive integers and denote for $x, y \ge 0$ and any $j = 0, 1, \ldots$

$$G_N(x, y; j) = \sum_{k=jN}^{jN+N-1} \frac{x^k}{k!} f^{(k)}(y).$$

Pakes and Dekking (1991) showed that $P(V_N = 0) = 1 - \tau_N$, where $1 - \tau_N$ is the smallest solution in $[0, 1]$ of the equation

$$x = G_N(1 - x, x; 0). \qquad (1)$$

Our goal is to study the distribution of $V_N$. As the following result shows, the probability mass function (pmf) of $V_N$ can be obtained using the Taylor expansion of $f(1)$ about the point $1 - \tau_N$.

Theorem 1 If $N \in \mathbb{N}$ then for any $j = 0, 1, \ldots$

$$P(V_N = j) = G_N(\tau_N, 1 - \tau_N; j) \qquad (2)$$

and $P(V_N = 0) = 1 - \tau_N$ is the smallest solution in $[0, 1]$ of (1).

Remark (i) If $N = 1$, then obviously $P(V_1 = 0) = 1 - \tau_1 = q$ is the extinction probability of the Galton-Watson process. Now, (2) becomes

$$P(V_1 = j) = \frac{(1 - q)^j}{j!} f^{(j)}(q), \qquad j = 0, 1, \ldots,$$

which in turn implies that $E(s^{V_1}) = f(q + (1 - q)s)$. This identity follows directly by observing that the number of distinct infinite unary trees is equal to the number of first generation nodes having an infinite line of descent.$^1$

(ii) Also note that a sufficient condition for $P(V_N = 0) < 1$ is given in Pakes and Dekking (1991), Theorem 3. In particular, they show that $P(V_N = 0) < 1$ ($N \ge 2$) if

N-l 

2^E-f T <a-E^) 2 - 

j>N J ^ L j=0 
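As a numerical illustration of Theorem 1, the following minimal sketch iterates the map $x \mapsto G_N(1 - x, x; 0)$ from $x = 0$, which is the recursion for $P(V_{N,n} = 0)$ used in the proof in Section 2, and then evaluates (2). The function names `G`, `tau`, `pmf_V` and `geometric_deriv` are hypothetical helpers introduced here, and the geometric offspring law with mean 13 is chosen only because its derivatives are available in closed form.

```python
import math

def G(N, x, y, j, deriv):
    """G_N(x, y; j) = sum over k = jN, ..., jN+N-1 of x^k f^(k)(y) / k!.

    `deriv(k, y)` must return the k-th derivative of the offspring pgf at y.
    """
    return sum(x**k * deriv(k, y) / math.factorial(k)
               for k in range(j * N, j * N + N))

def tau(N, deriv, tol=1e-13, max_iter=100_000):
    """Iterate x <- G_N(1 - x, x; 0) from x = 0 and return tau_N = 1 - x.

    The iterates increase to the smallest fixed point in [0, 1].
    """
    x = 0.0
    for _ in range(max_iter):
        x_new = G(N, 1.0 - x, x, 0, deriv)
        if abs(x_new - x) < tol:
            break
        x = x_new
    return 1.0 - x_new

def pmf_V(N, deriv, j_max):
    """P(V_N = j) for j = 0, ..., j_max via formula (2)."""
    t = tau(N, deriv)
    return [G(N, t, 1.0 - t, j, deriv) for j in range(j_max + 1)]

def geometric_deriv(p):
    """Derivatives of f(s) = (1-p)/(1-ps): f^(k)(s) = k!(1-p)p^k/(1-ps)^(k+1)."""
    return lambda k, s: (math.factorial(k) * (1 - p) * p**k
                         / (1 - p * s) ** (k + 1))

if __name__ == "__main__":
    d = geometric_deriv(13 / 14)       # offspring mean p/(1-p) = 13
    print(tau(2, d), pmf_V(2, d, 3))
```

Because the derivatives are passed in as a callable, the same three functions can be reused for any offspring law whose derivatives are known; only moderate values of $jN + N - 1$ are supported, since the factorials are converted to floating point.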

The number of complete $N$-ary subtrees is a measure of the rate of growth (or fertility) of the branching process. In fact, as was pointed out in Dekking (1991), if $P(V_2 > 0) > 0$ then we can say that the branching process grows faster than binary splitting. In the study of the tree structure of branching processes, an important role is played by the process's total progeny. Denote by $\nu_n$ the number of individuals who existed in the first $n + 1$ generations, i.e., $\nu_n = 1 + Z_1 + \cdots + Z_n$, $n = 1, 2, \ldots$. Obviously, $\nu_n$ equals the total number of nodes having height less than or equal to $n$. Let us also define the random variable $V_{N,n}$ to be the number of disjoint complete $N$-ary subtrees of height at least $n$ rooted at the ancestor of a Bienaymé-Galton-Watson family tree. Let

$$\psi_{N,n}(s) = E(s^{\nu_n}; V_{N,n} > 0) \quad \text{and} \quad \varphi_{N,n}(s) = E(s^{\nu_n}; V_{N,n} = 0). \qquad (3)$$

The following result presents a recursive relation for the joint distribution of $V_{N,n}$ and $\nu_n$.

1 The authors are indebted to the referee who pointed out this argument. It implies immediately the result of 
Theorem 1 for unary trees. 
Theorem 2 If $N \in \mathbb{N}$ then for $|s| < 1$ and any $j = 0, 1, \ldots$

$$E(s^{\nu_{n+1}}; V_{N,n+1} = j) = s\, G_N(\psi_{N,n}(s), \varphi_{N,n}(s); j). \qquad (4)$$

Notice that, if $N = 1$ and $j = 0$, then the above recurrence reduces to the well-known $E(s^{\nu_{n+1}}; Z_{n+1} = 0) = s f(E(s^{\nu_n}; Z_n = 0))$, see e.g. Kolchin (1986), p. 120.

Applications of complete $N$-ary trees can be found in the analysis of algorithms, see Knuth (1997). Problems of this nature appear also in percolation theory. For instance, Pakes and Dekking (1991) point out a relationship between the model of $N$-ary complete and infinite subtrees and a construction employed by Chayes et al. (1988) in their study of Mandelbrot's percolation processes. The existence of $N$-ary subtrees is also used by Pemantle (1988) in introducing the concept of an $N$-infinite branching process. Let us also mention potential connections with problems of percolation of binary words on the nodes of locally finite graphs with countably infinite node-sets, see Benjamini and Kesten (1995).

We organize our paper as follows. In Section 2 we prove the main results. Sections 3-5 contain some illustrations. In Section 3 we consider the family tree generated by the fractional linear $f(s)$ as well as the special case of geometric offspring. In the latter case, $V_N$ itself follows a geometric distribution. It turns out that in the Poisson offspring case, given in Section 4, the pmf of $V_N$ can be expressed in terms of certain Poisson probabilities. Note that the critical values $m_N^c$ ($N \ge 2$) in the Poisson case are less than those in the geometric one. Finally, in Section 5 we consider the one-or-many (i.e., concentrated on two points only) offspring distribution. In this case $V_N$ has a pmf given in terms of binomial probabilities.



2 Proofs of the Theorems 

Proof of Theorem 1 Let us consider $P(V_N = j)$ where $j = 1, 2, \ldots$ Recall that the random variable $V_{N,n}$ equals the number of disjoint complete $N$-ary subtrees of height at least $n$ rooted at the ancestor of a Bienaymé-Galton-Watson family tree. First, we will find the pmf of $V_{N,n+1}$ using the total probability formula. Indeed, to have $j$ disjoint complete $N$-ary subtrees rooted at the ancestor node there must be $jN + k$ ($k \ge 0$) nodes in the first generation. Each of these nodes can be considered as an ancestor of a family tree rooted at the first generation. Consider the event $A_N(l) = \{jN + l$ of the $Z_1$ first generation nodes are ancestors of at least one complete $N$-ary tree of height $n\}$, where $l = 0, 1, \ldots, \min\{k, N - 1\}$. If $Z_1 = jN + k$ then for fixed $l$ the event $A_N(l)$ has conditional probability

$$P(A_N(l) \mid Z_1 = jN + k) = \binom{jN+k}{jN+l} \tau_{N,n}^{jN+l} (1 - \tau_{N,n})^{k-l} \qquad (0 \le l \le \min\{k, N - 1\}),$$

where $\tau_{N,n} = 1 - P(V_{N,n} = 0)$ and by convention let $\tau_{N,0} = 1$. We have

$$P\Big(\bigcup_{l=0}^{\min\{k, N-1\}} A_N(l) \,\Big|\, Z_1 = jN + k\Big) = \sum_{l=0}^{\min\{k, N-1\}} \binom{jN+k}{jN+l} \tau_{N,n}^{jN+l} (1 - \tau_{N,n})^{k-l}.$$
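Applying the total probability formula and changing the order of summation, we obtain

$$\begin{aligned}
P(V_{N,n+1} = j) &= \sum_{k=0}^{\infty} P(Z_1 = jN + k)\, P\Big(\bigcup_{l=0}^{\min\{k, N-1\}} A_N(l) \,\Big|\, Z_1 = jN + k\Big) \\
&= \sum_{k=0}^{\infty} p_{jN+k} \sum_{l=0}^{\min\{k, N-1\}} \binom{jN+k}{jN+l} \tau_{N,n}^{jN+l} (1 - \tau_{N,n})^{k-l} \\
&= \sum_{l=0}^{N-1} \frac{\tau_{N,n}^{jN+l}}{(jN+l)!} \sum_{k=l}^{\infty} p_{jN+k} (jN+k)(jN+k-1) \cdots (k-l+1) (1 - \tau_{N,n})^{k-l} \\
&= \sum_{l=0}^{N-1} \frac{\tau_{N,n}^{jN+l}}{(jN+l)!} f^{(jN+l)}(1 - \tau_{N,n}) \\
&= G_N(\tau_{N,n}, 1 - \tau_{N,n}; j).
\end{aligned}$$

By definition $\tau_{N,0} = 1$ and $\tau_{N,n} \downarrow \tau_N$ as $n \uparrow \infty$. Letting $n \to \infty$, we obtain for $j \ge 1$

$$P(V_N = j) = \lim_{n \to \infty} P(V_{N,n+1} = j) = G_N(\tau_N, 1 - \tau_N; j).$$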

Let us now consider the case $j = 0$. The above recurrence is true for $n = 0$, i.e., $P(V_{N,1} = 0) = G_N(1, 0; 0) = \sum_{k=0}^{N-1} p_k$. For $n \ge 1$, using the total probability formula and an argument similar to that for the case $j \ge 1$, we obtain

$$\begin{aligned}
P(V_{N,n+1} = 0) &= \sum_{l=0}^{N-1} \sum_{k=l}^{\infty} p_k \binom{k}{l} \tau_{N,n}^{l} (1 - \tau_{N,n})^{k-l} \\
&= \sum_{l=0}^{N-1} \frac{\tau_{N,n}^{l}}{l!} f^{(l)}(1 - \tau_{N,n}) \\
&= G_N(\tau_{N,n}, 1 - \tau_{N,n}; 0). \qquad (5)
\end{aligned}$$

Computing the derivative of $G_N(1 - x, x; 0)$, we get a telescoping sum which after cancellations becomes $dG_N(1 - x, x; 0)/dx = (1 - x)^{N-1} f^{(N)}(x)/(N - 1)! \ge 0$ for $0 \le x \le 1$. Thus, $G_N(1 - x, x; 0)$ is non-decreasing in $[0, 1]$, and therefore

$$1 - \tau_N = \lim_{n \to \infty} (1 - \tau_{N,n+1}) = \lim_{n \to \infty} P(V_{N,n+1} = 0) = G_N(\tau_N, 1 - \tau_N; 0)$$

is the smallest root in $[0, 1]$ of the equation $x = G_N(1 - x, x; 0)$. The proof is complete.

Clearly (2) implies $\sum_{j=0}^{\infty} P(V_N = j) = \sum_{k=0}^{\infty} \tau_N^k f^{(k)}(1 - \tau_N)/k! = f(1) = 1$.
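The telescoping step in the proof of Theorem 1 can be spelled out as follows. This is a routine expansion, written with the shorthand $g(x) = G_N(1 - x, x; 0) = \sum_{k=0}^{N-1} (1 - x)^k f^{(k)}(x)/k!$, a symbol introduced here only for illustration, and with the convention that the subtracted term is absent for $k = 0$:

$$g'(x) = \sum_{k=0}^{N-1} \left[ \frac{(1 - x)^k}{k!} f^{(k+1)}(x) - \frac{(1 - x)^{k-1}}{(k-1)!} f^{(k)}(x) \right] = \frac{(1 - x)^{N-1}}{(N-1)!} f^{(N)}(x).$$

Each subtracted term cancels the positive term of the preceding index, so only the positive term with $k = N - 1$ survives. It is nonnegative on $[0, 1]$ because all derivatives of a pgf are nonnegative there.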
Proof of Theorem 2 Let us introduce the notation

$$\tau_{N,n}(t) = P(V_{N,n} > 0, \nu_n = t), \qquad \gamma_{N,n}(t) = P(V_{N,n} = 0, \nu_n = t) = P(\nu_n = t) - \tau_{N,n}(t),$$

where $N$, $n$, and $t$ are positive integers. Proceeding as in the proof of Theorem 1, we consider the event

$$A_N(l, t) = A_N(l) \cap \{\nu_{n+1} = t\},$$

where $A_N(l)$ is defined in the proof of Theorem 1. For fixed $t$ and $l$ ($0 \le l \le \min(k, N - 1)$), using the fact that all trees rooted in the first generation grow independently, we compute the conditional probability of $A_N(l, t)$ given $Z_1 = jN + k$ to be

$$P(A_N(l, t) \mid Z_1 = jN + k) = \binom{jN+k}{jN+l} {\sum}' \prod_{u=1}^{jN+l} \tau_{N,n}(n_u) \prod_{v=jN+l+1}^{jN+k} \gamma_{N,n}(n_v),$$

where the summation in $\sum'$ is over all nonnegative integers $\{n_i\}_{i=1}^{jN+k}$ such that $\sum_{i=1}^{jN+k} n_i = t - 1$.
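Then, the total probability formula implies that

$$\begin{aligned}
P(V_{N,n+1} = j, \nu_{n+1} = t) &= \sum_{k=0}^{\infty} P(Z_1 = jN + k) \sum_{l=0}^{\min(k, N-1)} P(A_N(l, t) \mid Z_1 = jN + k) \\
&= \sum_{l=0}^{N-1} \sum_{k=l}^{\infty} p_{jN+k} \binom{jN+k}{jN+l} {\sum}' \prod_{u=1}^{jN+l} \tau_{N,n}(n_u) \prod_{v=jN+l+1}^{jN+k} \gamma_{N,n}(n_v).
\end{aligned}$$

Multiplying both sides of this equality by $s^t$ and summing over $t$, we get

$$\begin{aligned}
E(s^{\nu_{n+1}}; V_{N,n+1} = j) = {} & s \sum_{l=0}^{N-1} \frac{1}{(jN+l)!} \sum_{k=l}^{\infty} p_{jN+k} (jN+k)(jN+k-1) \cdots (k-l+1) \\
& \times \sum_{t=1}^{\infty} {\sum}' \prod_{u=1}^{jN+l} \tau_{N,n}(n_u) \prod_{v=jN+l+1}^{jN+k} \gamma_{N,n}(n_v)\, s^{t-1}.
\end{aligned}$$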

Observe that the coefficient of $s^{t-1}$ in the series

$$\sum_{t=1}^{\infty} {\sum}' \prod_{u=1}^{jN+l} \tau_{N,n}(n_u) \prod_{v=jN+l+1}^{jN+k} \gamma_{N,n}(n_v)\, s^{t-1}$$

can be written as

$$\sum_{h=0}^{t-1} \Bigg[ \sum_{n_1 + \cdots + n_{jN+l} = h} \prod_{u=1}^{jN+l} \tau_{N,n}(n_u) \Bigg] \Bigg[ \sum_{n_{jN+l+1} + \cdots + n_{jN+k} = t-1-h} \prod_{v=jN+l+1}^{jN+k} \gamma_{N,n}(n_v) \Bigg].$$

The rule of multiplying power series implies that this coefficient equals the coefficient of $s^{t-1}$ in the power series expansion of

$$\Bigg[ \sum_{i=1}^{\infty} \tau_{N,n}(i) s^i \Bigg]^{jN+l} \Bigg[ \sum_{i=1}^{\infty} \gamma_{N,n}(i) s^i \Bigg]^{k-l} = [\psi_{N,n}(s)]^{jN+l} [\varphi_{N,n}(s)]^{k-l},$$

where $\psi_{N,n}(s)$ and $\varphi_{N,n}(s)$ are defined in (3). Therefore,

$$E(s^{\nu_{n+1}}; V_{N,n+1} = j) = s \sum_{l=0}^{N-1} \frac{[\psi_{N,n}(s)]^{jN+l}}{(jN+l)!} \sum_{k=l}^{\infty} p_{jN+k} (jN+k)(jN+k-1) \cdots (k-l+1) [\varphi_{N,n}(s)]^{k-l},$$

which coincides with the right-hand side of (4). This completes the proof.



3 Fractional linear offspring 

Let $f(s)$ be a fractional linear pgf given by

$$f(s) = 1 - \frac{b}{1 - p} + \frac{bs}{1 - ps} \qquad (6)$$

with the parameter space $\{(p, b) : 0 < p < 1,\ 0 < b < 1 - p\}$. Then the offspring distribution is given by the geometric series $p_k = b p^{k-1}$, $k = 1, 2, \ldots$; $p_0 = 1 - \sum_{k=1}^{\infty} p_k$, and the offspring mean is $m = b/(1 - p)^2$. In the particular case $b = p(1 - p)$ we have $p_k = (1 - p)p^k$, $k \ge 0$, which is the standard geometric distribution with pgf $f(s) = (1 - p)/(1 - ps)$. It can be verified, see Pakes and Dekking (1991), p. 361 if $N \ge 2$ and Harris (1963), p. 9 if $N = 1$, that for $N \in \mathbb{N}$

$$1 - p(1 - \tau_N) = [b/(1 - p)]^{1/N} [p\tau_N]^{1 - 1/N}. \qquad (7)$$

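Proposition 1 If the offspring distribution has the fractional linear pgf (6), then $V_N$ follows a zero-modified geometric (i.e., fractional linear) distribution given by

$$P(V_N = j) = \frac{b}{p(1 - p)} (1 - \theta_N) \theta_N^j \quad (j \ge 1), \qquad P(V_N = 0) = 1 - \frac{b}{p(1 - p)} \theta_N \qquad (8)$$

and

$$EV_N = \frac{b}{p(1 - p)} \cdot \frac{\theta_N}{1 - \theta_N}, \qquad (9)$$

where

$$\theta_N = \left( \frac{p\tau_N}{1 - p(1 - \tau_N)} \right)^N$$

and $\tau_N$ is the largest solution in $[0, 1]$ of (7).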

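Proof Since $f^{(i)}(s) = i!\, b p^{i-1}/(1 - ps)^{i+1}$ ($i \ge 1$), we have from (2) for $j \ge 1$

$$P(V_N = j) = \sum_{k=jN}^{jN+N-1} \frac{\tau_N^k\, b p^{k-1}}{(1 - p(1 - \tau_N))^{k+1}} = \frac{b}{p(1 - p(1 - \tau_N))} \sum_{k=jN}^{jN+N-1} \frac{(p\tau_N)^k}{(1 - p(1 - \tau_N))^k}.$$

Now, setting $(\theta_N)^{1/N} = p\tau_N/(1 - p(1 - \tau_N))$ one can obtain the first formula in (8), which in turn leads to the second formula in (8) and to (9).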

Corollary If the offspring distribution is geometric, i.e., $p_k = (1 - p)p^k$, $k \ge 0$, then $V_N$ is geometric as well, $P(V_N = j) = (1 - \tau_N)\tau_N^j$ ($j \ge 0$) and $EV_N = \tau_N(1 - \tau_N)^{-1}$, where $\tau_N$ is the largest solution in $[0, 1]$ of $(\tau_N + 1/m)^N = \tau_N^{N-1}$ ($N \ge 1$).

Proof In the case of geometric offspring (6) holds with $b = p(1 - p)$ and $m = p/(1 - p)$. The equation for $\tau_N$ follows by inspection from (7). It is also given in Pakes and Dekking (1991), p. 361 if $N \ge 2$. Simple algebraic manipulations show that this equation simplifies to $\theta_N = \tau_N$. Now, the rest of the statement follows from (8) and (9).

Remark For geometric offspring with mean $m > 1$ we have $P(V_1 = j) = (1/m)(1 - 1/m)^j$ and $EV_1 = m - 1$. In particular, $P(V_1 = 0) = 1/m$, which equals the probability of extinction, see Harris (1963), p. 9.

Table 1 lists the probabilities $P(V_N = j)$, $j = 0, 1, 2, \ldots, 9$ as well as $EV_N$ for $1 \le N \le 5$. The critical mean values (see Section 1) are as follows: $m_1^c = 1$, $m_2^c = 4$, $m_3^c = 6.75$, $m_4^c = 9.481$, $m_5^c = 12.207$. The expected values in the last column provide a measure of how many $N$-ary subtrees ($1 \le N \le 5$) are supported by the geometric family tree with offspring mean fixed to be $m = 13$. See also Table 2 below for a comparison with the Poisson offspring case.



V_N =    0     1     2     3     4     5     6     7     8     9     ≥10   E(V_N)
N = 1    0.08  0.07  0.07  0.06  0.06  0.05  0.05  0.04  0.04  0.04  0.44  12
N = 2    0.16  0.14  0.11  0.10  0.08  0.07  0.06  0.05  0.04  0.03  0.16  5.22
N = 3    0.26  0.19  0.14  0.11  0.08  0.06  0.04  0.03  0.02  0.02  0.05  2.91
N = 4    0.37  0.23  0.15  0.09  0.06  0.04  0.02  0.01  0.01  0.01  0.01  1.71
N = 5    0.53  0.25  0.12  0.05  0.03  0.01  0.01  0     0     0     0     0.87

Table 1: Probability distribution of $V_N$ assuming geometric offspring with $m = 13$.
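The entries of Table 1 can be checked numerically from the Corollary. The sketch below rests on a rearrangement that is derived here rather than quoted from the paper: taking $N$th roots in $(\tau_N + 1/m)^N = \tau_N^{N-1}$ gives $u(\tau) = \tau^{(N-1)/N} - \tau = 1/m$, where $u$ is concave on $[0, 1]$ with its maximum $(N-1)^{N-1}/N^N$ at $\tau^* = ((N-1)/N)^N$. Under this rearrangement the largest root lies in $[\tau^*, 1]$ and can be found by bisection, and the critical mean is $N^N/(N-1)^{N-1}$, which agrees with the values quoted above. All function names are hypothetical.

```python
def critical_mean(N):
    """Critical mean N^N / (N-1)^(N-1) for geometric offspring (derived above)."""
    if N == 1:
        return 1.0
    return N**N / (N - 1) ** (N - 1)

def tau_geometric(N, m, tol=1e-12):
    """Largest root in [0, 1] of (tau + 1/m)^N = tau^(N-1); 0 below criticality."""
    if m < critical_mean(N):
        return 0.0
    u = lambda t: t ** ((N - 1) / N) - t   # equation reads u(tau) = 1/m
    lo = ((N - 1) / N) ** N                # maximiser of u
    hi = 1.0                               # u(1) = 0 < 1/m
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if u(mid) > 1.0 / m:               # u is decreasing on [lo, 1]
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def pmf_V_geometric(N, m, j_max):
    """P(V_N = j) = (1 - tau_N) tau_N^j for j = 0, ..., j_max."""
    t = tau_geometric(N, m)
    return [(1.0 - t) * t**j for j in range(j_max + 1)]

def mean_V_geometric(N, m):
    """E V_N = tau_N / (1 - tau_N)."""
    t = tau_geometric(N, m)
    return t / (1.0 - t)

if __name__ == "__main__":
    for N in range(1, 6):
        row = [round(q, 2) for q in pmf_V_geometric(N, 13.0, 9)]
        print(N, row, round(mean_V_geometric(N, 13.0), 2))
```

Bisection is used instead of a closed form because the equation has degree $N$ in $\tau_N$; for $N = 2$ it reduces to a quadratic and can be solved by hand.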



4 Poisson offspring 

Consider the case of Poisson offspring distribution with pgf given by

$$f(s) = e^{m(s-1)} \qquad (m > 0). \qquad (10)$$

Then, the probability $\tau_N$ is the largest solution of

$$(1 - s)e^{ms} = \sum_{j=0}^{N-1} (ms)^j/j! \qquad (11)$$

(see Pakes and Dekking (1991), p. 364). Since $f^{(i)}(s) = m^i e^{m(s-1)}$ ($i \ge 0$), formula (2) becomes

$$P(V_N = j) = e^{-m\tau_N} \sum_{k=0}^{N-1} \frac{(m\tau_N)^{jN+k}}{(jN+k)!}.$$

Therefore, we have the following

Proposition 2 If the offspring distribution has the Poisson pgf (10), then

$$P(V_N = j) = P(jN \le Y_N \le jN + N - 1),$$

where $Y_N$ has the Poisson pmf

$$P(Y_N = k) = (m\tau_N)^k e^{-m\tau_N}/k!, \qquad k = 0, 1, 2, \ldots$$

and $\tau_N$ is the largest solution in $[0, 1]$ of equation (11).
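Proposition 2 lends itself to a direct numerical check. The sketch below assumes that $\tau_N$ can be reached by iterating $t \mapsto 1 - P(Y \le N - 1)$ with $Y$ Poisson of mean $mt$, starting from $t = 1$; this is equation (11) rewritten as a fixed-point problem, and the iterates decrease to its largest solution. The function names are hypothetical, and the simple Poisson pmf used here is only suitable for moderate arguments.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Poisson(lam); adequate for moderate k and lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def tau_poisson(N, m, tol=1e-13, max_iter=100_000):
    """Largest solution in [0, 1] of (11), by monotone iteration from t = 1."""
    t = 1.0
    for _ in range(max_iter):
        t_new = 1.0 - sum(poisson_pmf(k, m * t) for k in range(N))
        if abs(t_new - t) < tol:
            break
        t = t_new
    return t_new

def pmf_V_poisson(N, m, j_max):
    """P(V_N = j) = P(jN <= Y_N <= jN + N - 1), Y_N ~ Poisson(m * tau_N)."""
    lam = m * tau_poisson(N, m)
    return [sum(poisson_pmf(k, lam) for k in range(j * N, j * N + N))
            for j in range(j_max + 1)]

if __name__ == "__main__":
    for N in range(2, 6):
        probs = pmf_V_poisson(N, 13.0, 30)
        mean = sum(j * q for j, q in enumerate(probs))
        print(N, [round(q, 2) for q in probs[:10]], round(mean, 2))
```

For $m = 13$ the value of $\tau_N$ is very close to 1 when $N$ is small, so $V_N$ behaves almost like the integer part of $Y/N$ with $Y$ Poisson of mean 13.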

Notice that $V_1$ has a Poisson distribution with parameter $m\tau_1$. To calculate the critical value $m_N^c$ that yields a non-zero solution in $[0, 1]$ of equation (11), we first notice that the product $y = m_N^c \tau_N^c$ satisfies the equation

$$y^N/(N - 1)! + \sum_{j=0}^{N-1} y^j/j! = e^y; \qquad (12)$$

see Pakes and Dekking (1991), p. 365. Following their way of calculation, one can find $m_N^c$ and $\tau_N^c$ by substituting the solution of (12) into

$$m y^{N-1}/(N - 1)! = e^y. \qquad (13)$$

In case of binary trees, one can also use Cayley's tree function $y(z) = \sum_{k=1}^{\infty} k^{k-1} z^k/k!$ (see e.g. Odlyzko (1995), Section 6.2) evaluated at $z = 1/m_N^c$ for the solution of (13). Inserting it into (12), we obtain $m_2^c = 3.3509$ and $\tau_2^c = 0.5352$.
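The two-step calculation of the Poisson critical values can be sketched numerically as follows. This is an illustration under stated assumptions, not the authors' procedure: the positive root $y$ of (12) is located by bisection on the bracket $[0.5, 50]$, which is assumed to contain it for the small $N$ considered here, then $m_N^c$ is read off from (13) and $\tau_N^c = y/m_N^c$. The function name is hypothetical.

```python
import math

def critical_poisson(N, lo=0.5, hi=50.0, tol=1e-12):
    """Return (m_c, tau_c, y) for Poisson offspring and N >= 2.

    y is the positive root of (12):
        y^N/(N-1)! + sum_{j<N} y^j/j! = e^y,
    and m_c then follows from (13): m * y^(N-1)/(N-1)! = e^y.
    """
    def h(y):
        partial = sum(y**j / math.factorial(j) for j in range(N))
        return math.exp(y) - partial - y**N / math.factorial(N - 1)

    # h is negative just right of 0 and positive for large y.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    y = 0.5 * (lo + hi)
    m_c = math.exp(y) * math.factorial(N - 1) / y ** (N - 1)
    return m_c, y / m_c, y

if __name__ == "__main__":
    for N in range(2, 6):
        m_c, tau_c, y = critical_poisson(N)
        print(N, round(m_c, 4), round(tau_c, 4))
```

Dividing the left-hand side minus the right-hand side of (12) by $y^N$ shows that the positive root is unique, so the bisection does not depend on the exact bracket as long as the sign change is enclosed.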

Our final remark concerns the case $m \to \infty$. It is easily seen that Proposition 2 and the normal approximation of the Poisson distribution imply a local limit theorem for $V_N$. Moreover, Pakes and Dekking (1991) showed that in this case $\tau_N \to 1$. This enables one to center and scale the limiting variable $V_N$ in terms of the single parameter $m$ only.

Table 2 gives the probabilities $P(V_N = j)$, $j = 0, 1, 2, \ldots, 9$ as well as $EV_N$ for $2 \le N \le 5$. The critical mean values are as follows: $m_2^c = 3.3509$, $m_3^c = 5.1494$, $m_4^c = 6.7993$, $m_5^c = 8.3653$.



V_N =    0     1     2     3     4     5     6     7     8     9     ≥10   E(V_N)
N = 2    0     0     0.01  0.04  0.11  0.19  0.22  0.19  0.13  0.07  0.04  6.25
N = 3    0     0.01  0.09  0.25  0.32  0.22  0.08  0.02  0     0     0.01  4.00
N = 4    0     0.05  0.30  0.41  0.19  0.04  0     0     0     0     0.01  2.87
N = 5    0     0.17  0.51  0.28  0.04  0     0     0     0     0     0     2.19

Table 2: Probability distribution of $V_N$ assuming Poisson offspring with $m = 13$.
5 One-or-many offspring



In this section we consider a two-parameter family of 1-or-$r$ offspring distributions defined for some $p \in (0, 1)$ by $p_1 = 1 - p$ and $p_r = p$, where $r > N > 1$. Its pgf is $f(s) = (1 - p)s + ps^r$ and thus $f'(s) = 1 - p + prs^{r-1}$ and $f^{(k)}(s) = pr(r - 1) \cdots (r - k + 1)s^{r-k}$ ($2 \le k \le r$). The
probability $\tau_N$ is the largest solution in $[0, 1]$ of

$$1 - \tau = 1 - p + p \sum_{k=0}^{N-1} \binom{r}{k} \tau^k (1 - \tau)^{r-k} \qquad (14)$$

(see again Pakes and Dekking (1991), p. 366). Applying (2) it is not difficult to obtain

$$P(V_N = 0) = 1 - p + p \sum_{k=0}^{N-1} \binom{r}{k} \tau_N^k (1 - \tau_N)^{r-k}$$

and for $j = 1, 2, \ldots$ and $r \ge jN$

$$P(V_N = j) = p \sum_{k=jN}^{jN+U} \binom{r}{k} \tau_N^k (1 - \tau_N)^{r-k},$$

where $U = \min\{N - 1, r - jN\}$. Let $B_r(\tau_N)$ denote a binomial $(r, \tau_N)$ random variable.

Proposition 3 If the offspring pgf is $f(s) = (1 - p)s + ps^r$ ($1 < N < r$) and $\tau_N$ is the largest solution in $[0, 1]$ of (14), then $P(V_N = 0) = 1 - p + pP(B_r(\tau_N) \le N - 1)$ and for $j = 1, 2, \ldots$

$$P(V_N = j) = pP(jN \le B_r(\tau_N) \le jN + U) \quad \text{if } jN \le r, \qquad (15)$$

where $U = \min\{N - 1, r - jN\}$, and $P(V_N = j) = 0$ if $jN > r$. The expected value of $V_N$ is

$$EV_N = p \sum_{j=1}^{[r/N]} j\, P(jN \le B_r(\tau_N) \le jN + U),$$

where $[x]$ is the integer part of $x$.

In particular, if $r = N + 1$ or $r = N + 2$ and $N > 2$, then (15) implies that $V_N$ takes on values 0 or 1; if $N = 2$ and $r = 4$, then $V_N$ takes on values 0, 1, or 2. Table 3 provides some numerical illustrations. Note that the offspring mean $m = 13.09$ enables comparisons with Tables 1 and 2.
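Proposition 3 can be evaluated numerically along the same lines as the previous cases. The sketch below assumes $N \ge 2$ and obtains $\tau_N$ by iterating $t \mapsto pP(B_r(t) \ge N)$ from $t = 1$; this fixed-point form is derived here by combining $P(V_N = 0) = 1 - \tau_N$ with the expression for $P(V_N = 0)$ in Proposition 3. The function names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math

def binom_pmf(k, r, t):
    """P(B = k) for a binomial (r, t) random variable B."""
    return math.comb(r, k) * t**k * (1.0 - t) ** (r - k)

def tau_one_or_r(N, r, p, tol=1e-13, max_iter=100_000):
    """Largest root in [0, 1] of t = p * P(B_r(t) >= N), for N >= 2."""
    t = 1.0
    for _ in range(max_iter):
        t_new = p * sum(binom_pmf(k, r, t) for k in range(N, r + 1))
        if abs(t_new - t) < tol:
            break
        t = t_new
    return t_new

def pmf_V_one_or_r(N, r, p):
    """[P(V_N = 0), ..., P(V_N = floor(r/N))] following Proposition 3."""
    t = tau_one_or_r(N, r, p)
    probs = [1.0 - p + p * sum(binom_pmf(k, r, t) for k in range(N))]
    for j in range(1, r // N + 1):
        upper = min(j * N + N - 1, r)      # jN + U with U = min(N-1, r-jN)
        probs.append(p * sum(binom_pmf(k, r, t)
                             for k in range(j * N, upper + 1)))
    return probs

if __name__ == "__main__":
    for N in range(2, 6):
        probs = pmf_V_one_or_r(N, 14, 0.93)
        mean = sum(j * q for j, q in enumerate(probs))
        print(N, [round(q, 2) for q in probs], round(mean, 2))
```

With $r = 14$ and $p = 0.93$ the iteration settles almost immediately, because $P(B_{14}(t) \ge N)$ is very close to 1 for $t$ near 0.93 and $N \le 5$.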

It is interesting to point out the following relationship between the 1-or-$r$ and Poisson offspring cases. There exists (see Pakes and Dekking (1991)) a critical value $p_N^c$ such that for $p = p_N^c$ equation (14) has a single solution in $(0, 1)$. Suppose that $r\tau_N^c \to y$ as $r \to \infty$, where $y$ satisfies (13) and (12). Then, applying Theorem 7 of Pakes and Dekking (1991), one can obtain that $V_N(r)$ converges in distribution to $V_N(y)$, where $V_N(r)$ and $V_N(y)$ are copies of $V_N$ assuming one-or-many and Poisson offspring with mean $m_N^c$, respectively.
V_N =    0     1     2     3     4     5     6     7     E(V_N)
N = 2    0.07  0     0     0     0     0.06  0.53  0.34  5.86
N = 3    0.07  0     0     0.06  0.87  0     0     0     5.05
N = 4    0.07  0     0.06  0.87  0     0     0     0     2.73
N = 5    0.07  0     0.93  0     0     0     0     0     1.86

Table 3: Probability distribution of $V_N$ assuming 1-or-14 offspring with $p = 0.93$ ($m = 13.09$).